Results 1 - 3 of 3
1.
medrxiv; 2022.
Preprint in English | medRxiv | ID: ppzbmed-10.1101.2022.06.02.22275926

ABSTRACT

Background: As highlighted by the COVID-19 pandemic, researchers are eager to make use of a wide variety of data sources, both government-sponsored and alternative, to characterize the epidemiology of infectious diseases. To date, few studies have investigated the strengths and limitations of the sources currently being used for such research; understanding these is critical for policy makers when interpreting study findings.

Methods: To fill this gap in the literature, we compared infectious disease reporting for three diseases (measles, mumps, and varicella) across four data sources: Optum (health insurance billing claims data), HealthMap (online news surveillance data), Morbidity and Mortality Weekly Reports (official government reports), and the National Notifiable Diseases Surveillance System (government case surveillance data). We reported the yearly number of national- and state-level disease-specific case counts and disease clusters according to each source during a five-year study period (2013-2017).

Findings: Our study demonstrated drastic differences in reported infectious disease incidence across data sources. Compared against the other three sources of interest, Optum data showed substantially higher, implausible standardized case counts for all three diseases. Although there was some concordance in identified state-level case counts and disease clusters, all four sources identified variations in state-level reporting.

Interpretation: Researchers should consider data source limitations when attempting to characterize the epidemiology of infectious diseases. Some data sources, such as billing claims data, may be unsuitable for epidemiological research within the infectious disease context.


Subject(s)
COVID-19, Mastocytosis, Systemic, Communicable Diseases
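The standardized case counts compared in this preprint can be illustrated with a minimal sketch. The preprint does not publish its exact standardization code, so the function name and all numbers below are illustrative assumptions, not study values; the sketch only shows the common cases-per-100,000 normalization that makes counts from differently sized sources comparable.

```python
# Hypothetical sketch: standardizing raw case counts to cases per
# 100,000 population so counts from different data sources (e.g.
# claims vs. case surveillance) can be compared on one scale.
# All figures are illustrative, not values from the study.

def cases_per_100k(case_count: int, population: int) -> float:
    """Return the case count standardized per 100,000 population."""
    return case_count * 100_000 / population

# Illustrative comparison for one hypothetical state-year:
claims_count = 1200      # billing-claims-derived count (hypothetical)
surveillance_count = 85  # case-surveillance count (hypothetical)
population = 5_000_000

print(cases_per_100k(claims_count, population))        # 24.0
print(cases_per_100k(surveillance_count, population))  # 1.7
```

Standardizing this way is what makes an "implausibly high" source visible: a large gap between per-capita rates from two sources for the same disease and region cannot be explained by population size alone.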
2.
medrxiv; 2022.
Preprint in English | medRxiv | ID: ppzbmed-10.1101.2022.05.18.22275217

ABSTRACT

The potential for bias in non-representative, large-scale, low-cost survey data can limit their utility for population health measurement and public health decision-making. We developed a multi-step regression framework to bias-adjust vaccination coverage predictions from the large-scale US COVID-19 Trends and Impact Survey that included post-stratification to the American Community Survey and secondary normalization to an unbiased reference indicator. As a case study, we applied this framework to generate county-level predictions of long-run vaccination coverage among children ages 5 to 11 years. Our vaccination coverage predictions suggest a low ceiling on long-term national coverage (46%), detect substantial geographic heterogeneity (ranging from 11% to 91% across counties in the US), and highlight widespread disparities in the pace of scale-up in the three months following Emergency Use Authorization of COVID-19 vaccination for 5 to 11 year-olds. Generally, our analysis demonstrates an approach to leverage differing strengths of multiple sources of information to produce estimates on the time-scale and geographic-scale necessary for proactive decision-making. The utility of large-scale, low-cost survey data for improving population health measurement is amplified when these data are combined with other representative sources of data.


Subject(s)
COVID-19
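The two adjustment steps this abstract describes, post-stratification to census population shares followed by normalization to an unbiased reference indicator, can be sketched as follows. The stratum names, coverage values, and benchmark below are illustrative assumptions; the preprint's actual regression framework and covariates are not reproduced here.

```python
# Hypothetical sketch of the bias-adjustment steps described above:
# (1) post-stratify survey estimates to census-style population shares,
# (2) rescale so the aggregate matches an unbiased benchmark indicator.
# All stratum names and numbers are illustrative, not study values.

survey_coverage = {"18-34": 0.55, "35-64": 0.62, "65+": 0.80}  # survey estimates
census_share    = {"18-34": 0.30, "35-64": 0.50, "65+": 0.20}  # population shares

# Step 1: post-stratified estimate = population-share-weighted mean
post_stratified = sum(survey_coverage[s] * census_share[s] for s in census_share)

# Step 2: normalize to an external benchmark (e.g. reported
# administrative vaccination coverage), capping at 100%
benchmark = 0.58  # unbiased reference value (hypothetical)
adjustment = benchmark / post_stratified
adjusted = {s: min(1.0, survey_coverage[s] * adjustment) for s in survey_coverage}

print(round(post_stratified, 3))  # 0.635
```

The design intuition is that post-stratification corrects for who answered the survey, while the secondary normalization corrects for residual response bias that re-weighting alone cannot remove.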
3.
medrxiv; 2021.
Preprint in English | medRxiv | ID: ppzbmed-10.1101.2021.04.21.21255862

ABSTRACT

Background & Objective: During infectious disease outbreaks, health agencies often share text-based information about cases and deaths. This information is rarely machine-readable, which creates challenges for outbreak researchers. Here, we introduce a generalizable data assembly algorithm that automatically curates text-based, outbreak-related information and demonstrate its performance across three outbreaks.

Methods: After developing an algorithm with regular expressions, we automatically curated data from health agencies via three information sources: formal reports, email newsletters, and Twitter. A validation data set was also curated manually for each outbreak.

Findings: When compared against the validation data sets, the overall cumulative missingness and misidentification of the algorithmically curated data were ≤2% and ≤1%, respectively, for all three outbreaks.

Conclusions: Within the context of outbreak research, our work successfully addresses the need for generalizable tools that can transform text-based information into machine-readable data across varied information sources and infectious diseases.
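The regular-expression curation step this abstract describes can be sketched minimally. The preprint's actual patterns are not given here; the pattern, function name, and example sentence below are illustrative assumptions showing how counts of cases and deaths might be pulled from one free-text update.

```python
import re

# Hypothetical sketch of regex-based curation of outbreak counts from
# text-based updates. The pattern below is an illustrative assumption,
# not the pattern used in the preprint.
COUNT_RE = re.compile(
    r"(?P<count>[\d,]+)\s+(?:new\s+)?(?P<kind>cases|deaths)",
    re.IGNORECASE,
)

def curate(text: str) -> dict:
    """Extract case/death counts from one text-based update."""
    counts = {}
    for m in COUNT_RE.finditer(text):
        counts[m.group("kind").lower()] = int(m.group("count").replace(",", ""))
    return counts

print(curate("The ministry reported 1,204 new cases and 17 deaths today."))
# {'cases': 1204, 'deaths': 17}
```

In practice, a pipeline like the one described would pair such extraction with the manually curated validation set to measure missingness (counts the regex failed to find) and misidentification (counts it extracted incorrectly).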
